---
title: "STA 533 Homework 7"
author: "Melanie Miller"
date: "West Chester University"
output: 
  html_document:
    toc: yes
    toc_float:
      collapsed: yes
      smooth_scroll: true
    number_sections: yes
    code_folding: hide
    code_download: yes
    theme: lumen
editor_options:
  chunk_output_type: inline
---


<style type="text/css">

/* Table of contents - navigation */
div#TOC li {
    list-style:none;
    background-color:lightgray;
    background-image:none;
    background-repeat:no-repeat;
    background-position:0;
    font-family: Arial, Helvetica, sans-serif;
    color: #780c0c;
}


/* Title fonts */
h1.title {
  font-size: 24px;
  color: darkblue;
  text-align: center;
  font-family: Arial, Helvetica, sans-serif;
  font-variant-caps: normal;
}
h4.author { 
  font-size: 18px;
  font-family: Arial, Helvetica, sans-serif;
  color: navy;
  text-align: center;
}
h4.date { 
  font-size: 18px;
  font-family: Arial, Helvetica, sans-serif;
  color: darkblue;
  text-align: center;
}

/* Section headers */
h1 {
    font-size: 22px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

h2 {
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { 
    font-size: 15px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

h4 {
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}
</style>


```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("tidyverse")) {
   install.packages("tidyverse")
   library(tidyverse)
}
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("jpeg")) {
   install.packages("jpeg", dependencies = TRUE)
   library(jpeg)
}

if (!require("RCurl")) {
   install.packages("RCurl", dependencies = TRUE)
   library(RCurl)
}

if (!require("plotly")) {
   install.packages("plotly", dependencies = TRUE)
   library(plotly)
}
if (!require("leaflet")) {
    install.packages("leaflet")              
    library("leaflet")
}
if (!require("tmap")) {
    install.packages("tmap")              
    library("tmap")
}

knitr::opts_chunk$set(echo = TRUE,       
                      warning = FALSE,   
                      results = "markup",
                      message = FALSE,
                      comment = NA)
```


# Introduction - Gas Station Data
This analysis uses a gas station data set with 72,798 records, 30 feature variables, and 1 binary outcome variable. The maps below are based on a random sample of 500 gas stations in the US.


# Data Preparation

Importing the data set:
```{r}
poc<-read.csv(url("https://melaniemiller1.github.io/STA533/w07-HW/POC.csv"))
```


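Take a random sample of 500 gas stations in the US:
```{r}
# Fix the random seed so the same 500 stations are drawn on every knit.
# The seed value is arbitrary (an assumption); any fixed integer works.
set.seed(533)
sample500 <- poc[sample(nrow(poc), 500), ]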
```
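
Before mapping, it can help to confirm that the coordinate columns are complete and fall within valid longitude and latitude ranges. The chunk below is a minimal sketch on a small made-up data frame: `toy_stations`, `valid_coords`, `toy_clean`, and all values are hypothetical, and only the column names `xcoord` and `ycoord` come from the data set above.

```{r}
# Hypothetical toy data: column names mirror the gas station data,
# but the values are made up for illustration.
toy_stations <- data.frame(
  xcoord = c(-75.16, -87.63, NA, -200),   # longitude
  ycoord = c(39.95, 41.88, 34.05, 40.00)  # latitude
)

# A row is mappable if both coordinates are present and within valid ranges.
valid_coords <- with(
  toy_stations,
  !is.na(xcoord) & !is.na(ycoord) &
    xcoord >= -180 & xcoord <= 180 &
    ycoord >= -90 & ycoord <= 90
)

# Keep only the rows that passed the check.
toy_clean <- toy_stations[valid_coords, ]
nrow(toy_clean)
```

The same logical filter could be applied to `sample500` if the real data turned out to contain missing or out-of-range coordinates.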


# Gas Stations in the US - leaflet
```{r}
# Build hover labels from the address fields; leaflet evaluates the
# formula against sample500.
label.msg <- ~paste0(ADDRESS, ", ", county, " ", STATE, ", ", ZIPnew)

leaflet(sample500) %>%
  addTiles() %>%
  # Outline the bounding box of the sampled stations.
  addRectangles(
    lng1 = min(sample500$xcoord), lat1 = min(sample500$ycoord),
    lng2 = max(sample500$xcoord), lat2 = max(sample500$ycoord),
    fillColor = "transparent"
  ) %>%
  # Zoom the map to that bounding box; this sets the initial view,
  # so a separate setView() call is not needed.
  fitBounds(
    lng1 = min(sample500$xcoord), lat1 = min(sample500$ycoord),
    lng2 = max(sample500$xcoord), lat2 = max(sample500$ycoord)
  ) %>%
  addMarkers(~xcoord, ~ycoord, label = label.msg)
```

In this random sample, gas stations are concentrated on the East Coast. Coverage in the Midwest is sparse, while hardly any part of the East Coast is without a gas station.
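
The visual impression from the map could be checked numerically by counting stations per state. The chunk below is a minimal sketch on made-up data: `toy_sample`, `state_counts`, and the values are hypothetical, and only the column name `STATE` comes from the data set above.

```{r}
# Hypothetical toy sample; STATE mirrors the column used in the map labels.
toy_sample <- data.frame(
  STATE = c("PA", "NY", "PA", "KS", "NJ", "PA", "NY")
)

# Count stations per state and sort from most to least frequent.
state_counts <- sort(table(toy_sample$STATE), decreasing = TRUE)
state_counts

# Share of the sample in each state.
round(prop.table(state_counts), 2)
```

Applied to `sample500$STATE`, the same two calls would show whether East Coast states really account for a larger share of the sample.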


# Gas Stations in the US - plotly
```{r}
# Map styling for the plotly geo layout (plotly is loaded in the setup chunk).
g <- list(
  scope = 'usa',
  projection = list(type = 'albers usa'),
  showland = TRUE,
  landcolor = toRGB("gray95"),
  subunitcolor = toRGB("gray85"),
  countrycolor = toRGB("gray85"),
  countrywidth = 0.5,
  subunitwidth = 0.5
)

# One marker per sampled station; hover text lists state, county, address, and ZIP.
fig <- plot_geo(sample500, lat = ~ycoord, lon = ~xcoord) %>%
  add_markers(
    text = ~paste(STATE, county, ADDRESS, ZIPnew, sep = "<br>"),
    hoverinfo = "text"
  ) %>%
  layout(title = 'Gas Stations in the United States', geo = g)

fig
```

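This plotly map shows the same sample of 500 gas stations on an Albers USA projection instead of an interactive tile map. Hovering over a point displays the state, county, address, and ZIP code.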



# Philly Crime Data
This analysis uses the Philly crime data set, which has 15,520 records and 18 variables and covers crime cases since 2015. The map below is based on the subset of cases from 2023.


# Data Preparation
Importing the Philly crime data (cases since 2015):
```{r}
phillycrime<-read.csv(url("https://melaniemiller1.github.io/STA533/w07-HW/PhillyCrimeSince2015.csv"))
```


Extract the year from the variable `date` and add it to the data set as a new variable `year`:
```{r}
year<- format(as.Date(phillycrime$date, format="%m/%d/%Y"),"%Y")

phillycrime<- cbind(phillycrime, year)
```
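
To illustrate what the two-step conversion does, the chunk below runs it on a few made-up date strings. It is a minimal sketch: `toy_dates` and `toy_year` are hypothetical, and the month/day/year layout is the one assumed by the `format="%m/%d/%Y"` argument above.

```{r}
# Made-up date strings in the month/day/year layout assumed above.
toy_dates <- c("01/05/2015", "12/31/2022", "03/15/2023")

# as.Date() parses the text; format(..., "%Y") returns the year
# as a character string.
toy_year <- format(as.Date(toy_dates, format = "%m/%d/%Y"), "%Y")
toy_year

# A string that does not match the format becomes NA rather than
# raising an error, so it is worth checking for NA values afterwards.
as.Date("2023-03-15", format = "%m/%d/%Y")
```

Because `year` is stored as a character string, a quick `sum(is.na(year))` on the real data would reveal any dates that failed to parse.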


Subset the data to 2023 cases only:
```{r}
# year is stored as a character string, so compare against "2023".
philly2023 <- phillycrime %>%
  filter(year == "2023")
```


# Map of Crime in Philly in 2023

```{r}
# One color per row: orange for fatal cases, navy for nonfatal cases.
pal <- ifelse(philly2023$fatal == "Fatal", "orange", "navy")

leaflet(philly2023) %>%
  addTiles() %>%
  # Center the map on Philadelphia.
  setView(lng = -75.1652, lat = 39.9526, zoom = 10.5) %>%
  addCircleMarkers(
    ~lng,
    ~lat,
    color = pal,
    stroke = FALSE,
    fillOpacity = 0.4,
    label = ~paste("Neighborhood:", neighborhood,
                   "Sex:", sex,
                   "Race:", race,
                   "Age:", age)
  ) %>%
  addLegend(position = "bottomright",
            colors = c("orange", "navy"),
            labels = c("Fatal", "Nonfatal"),
            title = "Fatal",
            opacity = 0.4)


```


This map shows crime cases in Philly in 2023. Navy points are nonfatal cases and orange points are fatal cases. Visually, nonfatal cases appear to outnumber fatal cases in 2023, although fatal cases are still numerous.
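
Because overlapping semi-transparent points are hard to count by eye, the comparison could be backed up with a simple frequency table. The chunk below is a minimal sketch on made-up data: `toy_crime`, `fatal_counts`, and the values are hypothetical, and only the column name `fatal` and its two labels come from the code above.

```{r}
# Hypothetical toy data; `fatal` mirrors the column used to color the map.
toy_crime <- data.frame(
  fatal = c("Nonfatal", "Fatal", "Nonfatal", "Nonfatal", "Fatal")
)

# Number of cases in each category.
fatal_counts <- table(toy_crime$fatal)
fatal_counts

# Proportions make the fatal/nonfatal split easier to compare.
round(prop.table(fatal_counts), 2)
```

Running the same two calls on `philly2023$fatal` would give the actual 2023 counts and shares.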


















